Improve issue #71: Optimize nested structure decoding performance #73
Problem
When decoding structs with data nested inside two or more layers of slices or maps, the decoder exhibited exponential performance degradation based on the number of values.
Example Structure
Performance Before Fix
The performance degradation was exponential, making the decoder unusable for real-world nested data.
Root Cause
The `findAlias()` function performed a linear O(n) search through the `dataMap` slice for every alias lookup. With deeply nested structures, this function was called thousands or millions of times, resulting in O(n²) or worse complexity.

For example, with 1000 nested elements, the parser would:
- Create `recursiveData` entries (1 for `foos`, 1 for `foos[0].bars`, 1000 for `foos[0].bars[N].lookup`)
- Call `findAlias()` many times during parsing and decoding
- Iterate through the entire `dataMap` linearly on each `findAlias()` call

Solution
Replaced the linear search with a hash map lookup (O(1)):
- Added an `aliasMap map[string]*recursiveData` field to the `decoder` struct
- Updated `parseMapData()` to populate the map as aliases are created
- Updated `findAlias()` to use the map instead of iterating through the slice

Code Changes
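The lookup change described above can be sketched as follows. This is a minimal illustration, not the actual code from `decoder.go`: the single-field `recursiveData` layout, the `addAlias` helper, and the method names `findAliasLinear` and `findAliasMap` are assumptions made for the example. Only `dataMap`, `aliasMap`, `recursiveData`, and `findAlias()` are named in this description.

```go
package main

import "fmt"

// recursiveData is a simplified stand-in. The real struct's fields are not
// shown in this description, so this one-field layout is an assumption.
type recursiveData struct {
	alias string
}

// decoder keeps the original slice plus the new alias index.
type decoder struct {
	dataMap  []*recursiveData
	aliasMap map[string]*recursiveData
}

// addAlias is a hypothetical helper: it appends an entry to the slice and
// indexes it in the map at the same time, so the two never drift apart.
func (d *decoder) addAlias(alias string) *recursiveData {
	rd := &recursiveData{alias: alias}
	d.dataMap = append(d.dataMap, rd)
	if d.aliasMap == nil {
		d.aliasMap = make(map[string]*recursiveData)
	}
	d.aliasMap[alias] = rd
	return rd
}

// findAliasLinear mirrors the old behavior: an O(n) scan on every lookup.
func (d *decoder) findAliasLinear(alias string) *recursiveData {
	for _, rd := range d.dataMap {
		if rd.alias == alias {
			return rd
		}
	}
	return nil
}

// findAliasMap mirrors the new behavior: one map lookup, O(1) on average.
// A missing key yields nil, matching the linear version.
func (d *decoder) findAliasMap(alias string) *recursiveData {
	return d.aliasMap[alias]
}

func main() {
	d := &decoder{}
	d.addAlias("foos")
	d.addAlias("foos[0].bars")
	for i := 0; i < 1000; i++ {
		d.addAlias(fmt.Sprintf("foos[0].bars[%d].lookup", i))
	}

	key := "foos[0].bars[999].lookup"
	// Both strategies return the same entry; only the cost differs.
	fmt.Println(d.findAliasLinear(key) == d.findAliasMap(key)) // true
	fmt.Println(len(d.dataMap))                                // 1002
}
```

With n entries and roughly one lookup per entry, the linear version performs on the order of n² comparisons in total, while the map version stays close to n.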
decoder.go:
- Added `aliasMap` field to `decoder` struct for O(1) lookups
- Updated `parseMapData()` to populate the map when creating `recursiveData` entries
- Updated `findAlias()` to use map lookup instead of linear search

decoder_test.go:
Test infrastructure (test-only, not in production binary):
- `race_test.go` / `norace_test.go`: Detect race detector to adjust performance thresholds

Performance After Fix
Without race detector (local development):
With race detector (CI environment):
The optimization provides a ~100x speedup for nested structures with hundreds of elements.
Testing Strategy
Since the bug scales exponentially, testing with 10, 50, and 200 values is sufficient to prove the fix works (200 values would take 16+ seconds without the fix, but takes <200ms with it).
The test uses build tags to detect if the race detector is enabled:
- Without `-race`: Strict thresholds for fast local feedback
- With `-race`: Lenient thresholds accounting for 5-10x race detector overhead

This ensures tests pass reliably on CI while still catching performance regressions.
Impact
Verification
All CI checks pass:
Tested locally on:
Does not fully fix #71, but brings a significant improvement.